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This is a major revision of the manuscript http://arxiv.org/abs/1406.2816. We have 
significantly extended the numerical experiments, adding the comparison of cross 
algorithms, verification via the Monte Carlo method, computation of the exceedance 
probabilities, the log-normally distributed coefficient, and the systematic study of 
the performance w.r.t. different parameters of the stochastic PDE (correlation 
length, variance, etc.). Some unused material is removed. 

Abstract 

We apply the Tensor Train (TT) decomposition to construct the tensor product 
Polynomial Chaos Expansion (PCE) of a random field, to solve the stochastic ellip¬ 
tic diffusion PDE with the stochastic Galerkin discretization, and to compute some 
quantities of interest (mean, variance, exceedance probabilities). We assume that the 
random diffusion coefficient is given as a smooth transformation of a Gaussian random 
field. In this case, the PCE is delivered by a complicated formula, which lacks an an¬ 
alytic TT representation. To construct its TT approximation numerically, we develop 
the new block TT cross algorithm, a method that computes the whole TT decompo¬ 
sition from a few evaluations of the PCE formula. The new method is conceptually 
similar to the adaptive cross approximation in the TT format, but is more efficient 
when several tensors must be stored in the same TT representation, which is the case 
for the PCE. Besides, we demonstrate how to assemble the stochastic Galerkin matrix 
and to compute the solution of the elliptic equation and its post-processing, staying 
in the TT format. 

We compare our technique with the traditional sparse polynomial chaos and the 
Monte Carlo approaches. In the tensor product polynomial chaos, the polynomial 
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degree is bounded for each random variable independently. This provides higher accu¬ 
racy than the sparse polynomial set or the Monte Carlo method, but the cardinality 
of the tensor product set grows exponentially with the number of random variables. 
However, when the PCE coefficients are implicitly approximated in the TT format, 
the computations with the full tensor product polynomial set become possible. In the 
numerical experiments, we confirm that the new methodology is competitive in a wide 
range of parameters, especially where high accuracy and high polynomial degrees are 
required. 

Keywords: uncertainty quantification, polynomial chaos expansion, Karhunen-Loeve ex¬ 
pansion, stochastic Galerkin, tensor product methods, tensor train format, adaptive cross 
approximation, block cross 

MSC2010: 15A69, 65F10, 60H15, 60H35, 65C30 

1 Motivation 

Situations in which one is concerned with uncertainty quantihcation often come in the 
following guise: we are investigating physical models where inputs are not given precisely, 
but instead are random quantities or random helds, or depend on a set of parameters. A 
classical example is the Darcy flow model with a random diffusion coefficient, 

— Vk(x, cu)Vu(x, cu) = f(x, cu), X G D c (1) 

where d is the spatial dimension, k(x, cu) is a random held dependent on a random parame¬ 
ter cu in a probability space (O, A, 7). The solution u(x, cu) belongs to (D) w.r.t. x and 
the same probability space w.r.t. cu. There is an established theory about the existence and 
uniqueness of the solution to ([T]) under various assumptions on k and f; see, for example, 

PI EQ ESI IH HZ]. 

In pn 123] it is shown that under additional assumptions on the right-hand side f and 
special choices of the test space the problem ([T]) is well-posed. The case where the Lax- 
Milgram theorem is not applicable (e. g. upper and lower constants j<, 'kinO<j<< k< 
1< < oo do not exist) is also considered in jTT]. In |21j the authors analyze assumptions 
on K which were made in P| to guarantee the uniqueness and the existence of the solution 
and to offer a new method with much weakened assumptions. Additionally, in pi], the 
authors point out that after truncating terms in the expansion for k, as is done in im, it 
is not guaranteed that the truncated k will be strictly bounded from zero. As a result the 
existence of the approximate solution to ([T]) is questionable. The remarkable difference of 
the approach in |21j from approaches in PI HU is that the permeability coefficient k is not 
truncated and, as a result, the ellipticity condition is maintained. 

To solve (IT]) we need to discretize the random held k. This requires some knowledge on 
the probability space Q and the probability measure. A widely used approach relies on two 
assumptions: k is dehned as an invertible smooth function of another random held with 
known distribution (e.g. normal), and the covariance function of either of those helds is 
given. After that k(x, cu) can be expanded to a series of functions, depending on x and a set 
of new parameters 0 = (0i, 02,.. .)• Typically used are polynomials of 6 , hence called the 
Polynomial Chaos Expansion (PCE) [661167] family of discretizations. Another approach 
is the collocation on some grid in 0 [H |3]. Each of 0 is a random quantity depending 
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on to, but the domain of 0 is known deterministically. Introducing a discretization for 0, 
we turn the stochastic problem ([T]) into a high-dimensional deterministic one. However, 
its straightforward numerical solution suffers from the curse of dimensionality: even if 6 
contains a hnite amount of variables (say, M), the number of discrete degrees of freedom 
grows exponentially with M. 

To surmount this issue, a data-sparse approximation is needed. During the last years, 
low-rank tensor product techniques were successfully applied to the solution of high-dimensional 
stochastic and parametric PDEs. A recent literature survey of low-rank tensor approxima¬ 
tion techniques can be found in E21- The tensor product approach involves a format, in 
which the data are represented, and a set of algorithms to manipulate the data in the 
format. The algorithms can be roughly separated into three classes: methods performing 
basic approximate algebra (additions, products, simple iterative methods) of data within 
the format; methods constructing the formatted representation from a few entries of the 
initial data; and methods aiming to solve equations, e.g. linear systems, ODEs or eigenvalue 
problems, keeping all data in the format. 

To some extent, these methods have already been applied to parametric problems. Non- 
intrusive (black box) tensor methods for multi-parametric problems, i.e. “class 2”, were 
developed in |5l |T5]. In particular, in jl] the authors follow the stochastic collocation 
approach and compute functionals of the solution of multi-parametric PDEs. Since the 
stochastic collocation allows to solve uncoupled deterministic problems for different collo¬ 
cation points, the functional of the solution (e.g. the average value) can be approximated 
straightforwardly via the black box hierarchical tensor interpolation algorithm. To com¬ 
pute the whole stochastic solution is a more difficult problem, especially in the stochastic 
Galerkin framework, where deterministic problems are coupled. 

In [36l 1371 uni [60l |69] the authors develop iterative methods and preconditioners to solve 
numerically discretized multi-parametric problems. Several manipulations of the PCE with 
a low-rank approximation have been considered. In [11] the authors assume that the solution 
has a low-rank canonical (CP) tensor format and develop methods for the CP-formatted 
computation of level sets. In [l5l ITS] the authors analyzed tensor ranks of the stochastic 
operator. The proper generalized decomposition was applied for solving high dimensional 
stochastic problems in jH] [50]. In |331 ITT] [35] the authors employed newer tensor formats, 
the Tensor Train (TT) and Quantized TT (QTT), for the approximation of coefficients 
and the solution of stochastic elliptic PDEs. The theoretical study of the complexity of 
the stochastic equation was provided, for example, by means of the analytic regularity and 
(generalized) PC approximation [67] for control problems constrained by linear parametric 
elliptic and parabolic PDEs [38]. 

Other classical techniques to cope with high-dimensional problems are sparse grids [251 
uniiiB] and (quasi) Monte Carlo methods [26] EH EH] • Nevertheless, tensor product methods 
are more flexible than sparse grids, as they allow to avoid severe reductions of the model 
from the very beginning, and adapt a suitable structure on the discrete level. Compared to 
Monte Carlo methods, tensor techniques work implicitly with the whole solution, and even 
a construction of a tensor format for entry-wise given data in a black box manner uses less 
randomness than the MC approach. 

In this article we approximate the PCE of the input coefficient k(x, cn) in the TT tensor 
format. After that we compute the solution u(x, cn) and perform all post-processing in the 
same TT format. The hrst stage, computation of the PCE of k, involves a lengthy formula, 
dehning each entry of the discretized coefficient. To perform this computation efficiently. 
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we develop a new block cross approximation algorithm, which constructs the TT format for 
K from a few evaluations of the entry-wise formula. This formula delivers several tensors 
that are to be summed and approximated in a TT format. We show that the new algorithm 
is more efficient than several runs of a previously existing cross method IS7I for each tensor 
separately. As soon as the coefficient is given in the TT format, it becomes very easy to 
construct the stiffness matrix, derived from the stochastic Galerkin discretization of ©• 
We apply the alternating iterative tensor algorithm to solve a large linear system arising 
from ([T]), and hnally use the cross algorithm again to compute the exceedance probability 
from the solution. 

In the next section, we outline the general Galerkin, Polynomial Ghaos Expansion (PGE) 
and Karhunen-Loeve Expansion (KLE) discretization schemes for a random held. An intro¬ 
duction to the TT methods and the new block cross interpolation algorithm are presented in 
Section [3l Some details of how to apply the block cross algorithm to the PGE calculations 
are given in Section 14.11 We start with the TT approximation of the multi-dimensional 
input coefficient k. After that, in Section |T]2], we construct the stochastic Galerkin matrix 
in the TT format. Section ITTI is devoted to the efficient post-processing (computation of 
the mean value, covariance, and probability of a particular event) in the TT format. Nu¬ 
merical results in Section [5] demonstrate the practical performance of the TT approach in 
the outlined tasks. 


2 Discretisation and computation 

For brevity, we follow Ba, where more references may be found. See also the recent 
monograph jTO] to study more about KLE, PGE and multiindices. In |T71 IM| |63l [M] the 
authors discuss the stochastic Galerkin matrix, its sparsity and preconditioners. 

To discretize we follow the Galerkin approach. The Finite Element method (for 
example, with piecewise linear functions) is a natural way to discretize the spatial part. We 
choose a hnite dimensional subspace 


Vn = span{(pi(x),..(P n(x)} C V, (2) 

where V = (D) fl L^lD) is the Hilbert space of functions on x. For simplicity, we impose 

the homogeneous Dirichlet boundary conditions. 

Discretization in the probability space (O, A, 7) is less trivial. We use the Karhunen- 
Loeve Expansion (KLE) to determine a hnite set of independent parameters, dehning 
k(x, cu). However, the distribution of these parameters might be unknown, and special 
efforts are needed to resolve this. 

2.1 Discretization of the input random field 

We assume that k(x, cu) may be seen as a smooth transformation k = 4)(y) of the Gaussian 
random held y(x, cu). In this Section, we explain how to compute KLE of y if the covariance 
of K is given. For more details see [691 Sections 3.4.1, 3.4.2] or |29) . 

A typical example is the log-normal held with (t)(y) = exp(y). Expanding cf) in a series 
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of the Hermite polynomials gives 


+00 


4)(y] = ^ct3i^(y) ~ 4)i 


i=0 


i=0 


(\)iz)^hi[z]exp[-z^/2)dz, (3) 


where h,i(-) is the i-th Hermite polynomial, and Q is the nnmber of terms after the trunca¬ 
tion. 

The Gaussian held y(x, to) can be written as the Karhunen-Loeve expansion. First, 
given the covariance function of k(x, cu), we may relate it with the covariance function of 
y(x, cu] as follows (see details in |69l Sections 3.4.1, 3.4.2]), 


covK(x,y) = 


Cl 


- k(x)) [K(y,a)) - K(y)) dP(a)) ~ Y i!({)fcov(,(x,ij), (4) 


1=0 


where k(x] is the expectation of k(x, to). Solving this implicit Q-order equation, we derive 
coVy(x, y) [Sn]- Now, the KLE can be computed as follows: 

OO p 


y(x, cn) = gT,^(x)0,^(0)), where 


cov. 




lgm(h)dy = Anxgm(x] 


(5) 


m=l 


D 


Here we assume that the eigenfunctions gra absorb the square roots of the KL eigenvalues. 
The stochastic variables 0ra are normalized (they are uncorrelated and jointly Gaussian). 

The initial coefficient k depends nonlinearly on 0^^. In the discrete setting, we truncate 
PGE and write it for M random variables. 


k(x, cn) k(x, Q:)HQ,(0(cn)), where Hc,(0) (0i) ■ ■(6) 

oc£3m 


is the multivariate Hermite polynomial, o: = (ai,..., ajvi) is a multiindex (a tuple of 
multinomial orders), h,cc„,(0m) is the univariate Hermite polynomial, 0 = (0i,..., 0 m) is a 
tuple of random variables, and 3m is a set of all multinomial orders (see dehnition below). 
The Galerkin coefficients k(x, a) are evaluated as follows: 


k(x, a) 


(oci ■ 4- (Xm)! 

op! ■ ■ ■ Km! 


M 


4> 


ai- 




■+<^M 

m^l 


(7) 


where 4)a^+...+a^ is the Galerkin coefficient of the transform function in ([3]) and q°^[x) 
means just the ccm-th power of the KLE function value gmlx). 

In practice, we restrict the polynomial orders in ([6]) to hnite limits, which can be done 
in different ways. 

Definition 1. The full multi-index set is defined by restricting each component indepen¬ 
dently, 

3 m,p (^1 > • • • ) 0? • • • ) Pm; !>•••? A4} , 

where p — (pi,..., Pm) is a shortcut for the tuple of order limits. 
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The full set provides high flexibility for the resolution of stochastic variables [TSIIT^ I^. 
However, its cardinality is equal to + 1] < (p + 1)^, if Pm < p. This may become 

a enormous number even if p and M are moderate (p ~ 3, M ~ 20 is typical). In this paper, 
we do not store all (p + 1)^ values explicitly, but instead approximate them via a tensor 
product representation. 

Another approach is to preselect the set of polynomials with the moderate cardinality 

sa El H]. 


Definition 2. The sparse multi-index set is defined by restricting the sum of components, 

^M,p : a > 0, (Xi H-haM<p}. 

The sparse set contains 0 — 0(MP) values if M 3> p, which is definitely less 

than (p + 1 However, the negative side is that for a fixed p some variables are resolved 
worse than others, and the approximation accuracy may suffer. It may also be harmful to 
increase p since in the sparse set it contributes exponentially to the complexity. 

The low-rank tensor approximation allows to reduce the storage of the coefficient with 
the full set 3m, p from 0(p*^) to 0(M.pr^), where r is a data-dependent tensor rank. The¬ 
oretical estimates of r are under development; numerical studies reflect that often r does 
not depend on p and depends linearly (or even milder) on M. [33]. When M is moderate, 
and p is relatively large, the low-rank approach with the full index set (Def. [T]) becomes 
preferable. Besides, as soon as the coefficient k is computed in a tensor product form, to 
assemble the Galerkin matrix of the stochastic PDE is a much easier task than with the 
sparse set. 

Some complexity reduction in Formula d?]) can be achieved with the help of the KLE 
for the initial field k(x, cu). Consider the expansion 


oo L 

k(x, cu) = k(x) \Ahv«(x)p£(cu) cs k(x) 

£=1 l=\ 


( 8 ) 


where V£(x) are eigenfunctions of the integral operator with the covariance as the kernel 
(see e.g. |691 |29l E]). It’s hard to work with ([8]) straightforwardly, since the distribution 
of r\i is generally unknown. But we know that the set V(x) = {^{(x)}^^!, where L is the 
number of KLE terms after the truncation, serves as an optimally reduced basis. Therefore, 
instead of using ([7]) directly, we project it onto V(x): 


K{(a) 


,0Ci 


cxm)! 


(Xi! ■ ■ ■ (Xm! 


4> 


OCi H-hOCjvi 


M 

n 

m=l 


gS"(’‘)'’iWdx. 


(9) 


Note that the range f = 1,..., L may be much smaller than the number of discretization 
points N in D. After that, we restore the approximate coefficients (j71) : 


L 

k(x, fx) k(x)^V{( x)k«(q:). (10) 

i=\ 
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2.2 Discretization of the stochastic operator equation 

The same PCE ansatz of the coefficient (|6]) may be adopted to discretize the solution u 
and ultimately the whole initial problem ([T]), see [T8l fT9] . For brevity, we illustrate the 
procedure on the deterministic right-hand side f = f(x). 

Given the KLE components ([H]) and the spatial discretization basis (|2]), we hrst assemble 
the spatial Galerkin matrices, 




K(x]V(pi(x) ■ V(pj(x]dx, 

D 


Vf(x)V(pi(x) ■ V(pj(x)dx, 

D 


( 11 ) 


for 1 , j = 1,..., N, f = 1,..., L. Now we take into account the PGE part Kq. Assuming that 
u is decomposed in the same way as ([S]) with the same 3m, p or we compute the integral 

in (IT^ over stochastic coordinates 9 and compute the stochastic parts G M#3MX#gM 
of the Galerkin matrix as follows (see also 123 EE S3]), 


k[^\cx,I3) 


Y_ U^)^.[Q)9[0)die = Y_ ( 12 ) 


where p[9) = p[0i) ■ ■ • p(0)vi) is the probability density with p[0Ta) = exp(—0^/2), and 




= A 




...A 




^ttdPti 


H«,.(0)Hp^(0)H,,,(0)p(0)d0, 

R 


(13) 


is the triple product of the Hermite polynomials, and Utiu) is according to ([9]). Let us 
denote Ao[ct,f3) = Aa^,|3,,o • ■ ■ i-O- Aq G Putting together ([lO]), 

and ffT^ . we obtain the whole discrete stochastic Galerkin matrix, 

K = k'"’®«)(14) 
{=1 

which is a square matrix of size N ■ and (8) is the Kronecker product. 

For the sparse index set, we need to compute 0 entries of A explicitly. For 

the full index set and given in the tensor format, the direct product in A ffT3|) allows 
to exploit the same format for (ITT)) and to simplify the procedure, see Sec. 14.21 
The deterministic right-hand side is extended to the size of K easily. 


f = /o G eo, 


/o(l) 


(pt(x)f(x)dx, i = 

D 


(15) 


and Co is the hrst identity vector of size — (1 > 0) • • • > 0)"'') which assigns the deter¬ 

ministic f(x] to the zeroth-order Hermite polynomial in the parametric space. 


2.3 Solution of the stochastic equation 

Now the discrete equation writes as a linear system 

Ku = f, (16) 
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where u G jg vector of the Galerkin coefficients for the solution, with elements 

enumerated by u(i, ck). 

In [60] the authors propose two new strategies for constructing preconditioners for the 
stochastic Galerkin system to be used with Krylov subspace iterative solvers. The authors 
also research the block sparsity structure of the corresponding coefficient matrix as well as 
the stochastic Galerkin matrix. In [631IM] the authors develop and analyze a Kronecker- 
product preconditioner. 

fxl 

In many cases it is sufficient to use the mean-held preconditioner Kq ® which is 
easy to invert due to the Kronecker form. We follow this approach. To solve the system in 
a tensor product format, we employ alternating optimization methods m- 

3 Tensor product formats and low-rank data compres¬ 
sion 

We see that the cardinality of the full polynomial set 3m, p may rise to prohibitively large 
values (p + 1 )^. In this paper, we study two ways to alleviate this problem. First, we can hx 
the basis set a priori. This is the case with the sparse set 3 m p- Due to particular properties 
of the stochastic elliptic equation, it is possible to derive a posteriori error indicators and 
build the sparse set adaptively [T6l fTTj . 

Another approach, which is applicable to a wider class of problems, is to use the full dis¬ 
cretization set, but to approximate already discrete data, via a low-rank tensor product for¬ 
mat. For stochastic PDFs, low-rank approximations were used in e.g. [HlElElEslsnilso]. 
This approach requires specihc computational algorithms, since the data are represented in 
a nonlinear fashion. In this section we suggest such an algorithm to construct a data-sparse 
format of the stochastic coefficient and quantities of interest. 

3.1 Tensor Train decomposition 

To show the techniques in the briefest way, we choose the so-called matrix product states 
(MPS) formalism [20], which introduces the following representation of a multi-variate 
array, or tensor: 


ri rz Tjvi-i 

S] =1 S2 = l Sjvi-1 =1 

Surely, in the same form we may write /^(q;]. In numerical linear algebra this format is 
known as tensor train (TT)” representation [52] [5l]. Each TT core (or block) is a three- 
dimensional array, G ^Tm+i)xTm^ m = 1, ..., M, where denotes the mode size, 
the polynomial order in the variable ara, and = Tmlu) is the TT rank. The total number 
of entries scales as O(Mpr^), which is tractable as long as r = max{rai} is moderate. 

We still have not specihed the border rank indices Sq and Sm- In the classical TT 
definition [52] and in iHB, there is only one tensor u(q:) in the left-hand side, therefore 
So = Sm = 1 , and also Tq = Tm = 1 . However, the left-hand side might contain several 
tensors, such as (E])- Then we can associate Sq = f or Sm = f, and approximate all 

tensors for different f in the same shared TT representation. 


8 


High-dimensional matrices (cf. K in can be also presented in the TT format, 

K= L ® ■ ■ ■ ®-ffr, ■ 

So,.--)SM-1 

The matrix by vector product Ku is then recast to TT blocks of K and u. Similarly, using 
the multilinearity of the TT format, we can cast linear combinations of initial tensors to 
concatenations of TT blocks. 

The principal favor of the TT format comparing to the Canonical Polyadic (CP) decom¬ 
position (which is also popular, see e.g. |TH1 is a stable quasi-optimal rank reduction 
procedure [52], based on singular value decompositions. The complexity scales as 0(Mpr^], 
i.e. it is free from the curse of dimensionality, while the full accuracy control takes place. 
This procedure can be used to reduce unnecessarily large ranks after the matrix by vector 
product or the linear combination in the TT format, and to compress a tensor if it is fully 
given as a multidimensional array and hts into the memory. However, in our situation this 
is not the case, and we need another approach to construct a TT format. 

3.2 Cross interpolation of matrices 

If M = 2, the left-hand side of flT7|) can be seen as a matrix. For simplicity we consider this 
case hrst. The basic assumption is that any entry of a matrix (or tensor) can be evaluated. 
However, to construct a TT approximation, we do not want to compute all elements, but 
only few of them. 

The principal ingredient for this is based on the efficiency of an incomplete Gaussian 
elimination in an approximation of a low-rank matrix, also known as the Adaptive Cross 
Approximation (ACA) [6l [7]. Given is a matrix U = [C/(i, j)] G we select some 

“good” columns and rows to approximate the whole matrix, 

C/(i,j) ^ C/(i,j) = U[iJ] ■ U-^ ■ C7(I,j), (18) 

where U = C/(I, J), and I C {1,..., p}, J C {1,..., q} are sets of indices of cardinality r, 
so e.g. C/(i, J) G It is known that there exists a quasi-optimal set of interpolating 

indices I, J. 

Lemma 3 (Maximum volume [maxvol) principle |25| ). If I and J are such that det C/(I, J) 
is maximal among all r x r submatrices of U, then 

||t/-^||c < (r+l) min \\U-VWi, 

rank(V’)=r 

where || • ||c is the Chebyshev norm, ||2C||c = maxy |A^y|. 

In practice, however, the computation of the true maximum volume submatrix is in¬ 
feasible, since it is an NP-hard problem. Instead one performs a heuristic iteration in an 
alternating fashion [21]: we start with some (e.g. random) low-rank factor G RT’^^, 
determine indices I yielding a quasi-maxvol r x r submatrix in and compute as r 
columns of U of the indices I. Vice versa, in the next step we hnd quasi-maxvol column in¬ 
dices in and calculate corresponding pr elements, collecting them into the newer 
which hopefully approximates the true low-rank factor better than the initial guess. This 
process continues until the convergence, which appears to be quite satisfactory in practice. 
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3.3 TT-cross interpolation of tensors 

In higher dimensions we recurrently proceed in the same way, since the TT format con¬ 
stitutes a recurrent matrix low-rank factorization. Let us merge the first m and the last 
M — m indices from a. The corresponding TT blocks will induce the following matrices: 

Definition 4. Given a TT format flT7|) and an index m. Define the left interface matrix 
G Mhi+i)-"(Pm-i+i)xT„i_, right interface matrix G R^’^^hm+i+b-'-tpM+i) 

follows, 


T] 


^m—2 




Si —1 SrrL—2 — ^ 

■fm+l T'M-I 


t^Sra (-^Ta+1 ) • • • ) y ■ ■ ■ y 


u 


(m+1) 

Sm^Sm-i-l 




Sm+1—1 Sm-1—1 


Such matrices are convenient to relate high-dimensional TT expressions with their two- 
dimensional analogs. For example, the TT format flT7|) can be written in the following 
form, 

u(q:) ^ (Xnx+l(X m). (19) 

The TT-cross algorithm |55] assumes that the above expansion is valid on some tra-i (pm + 
1 indices, and can thus be seen as a system of equations on elements of Let us be 
given r^a-i left indices (&i,..., &ra-i) ^ right indices (&ra+i > • • • > ^ 

For the single index cXra we allow its full range { 0 ,^} = {0,1,..., pra}- Requiring that ffT^ 
is valid on the combined indices 


we obtain the following computational rule for 

= ^<1 ■ u (X,,, (20) 

where C/<m = gnO f7>,Ti = are the submatrices of and 

at indices gj^o respectively. 

Of course, these submatrices must be nonsingular. In the previous subsection we saw 
a good strategy: if indices g^o are chosen in accordance with the maxvol 

principle for and the submatrices are not only nonsingular, but also provide 

a good approximation for ffT^ . However, in practice and are too large to be 

treated directly. Instead, we use the alternating iteration over the dimensions and build 
nested index sets. 

The alternating iteration means that we loop over m = 1,2,..., M (the so-called forward 
iteration) and m = M, M — 1,..., 1 {backward iteration). Consider first the forward transi¬ 
tion, m—1 —> m. Given the set nested indices are defined as follows. We con¬ 

catenate l[(’^~b gj^(j i g_ consider r,Ti-i (Pm + I) indices (&i,..., &m,-b <^m) ^ 
and select indices only from fpom all 0(p’^) possibilities. It can be 

seen that the next interface , restricted to oCmf, can be computed as a product 

of the previous submatrix C/<ra and the current TT block To formalize this, we need 
the following definition. 
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Definition 5. Given a three-dimensional tensor G x(pm+i)xTm^ introduce the 

following reshapes, both pointing to the same data stored in 

• left folding; v}^'> Snx] = e M’'m-dPm+i)xTn,^ 

• right folding; (Xm, Snx) ^ u^Z-^,sn.^o^nl]y G R^m-ix(pn,+i)rn,_ 

Then the restriction of writes as follows, 

VH = and oc^) = G M’-m-dPm+i)xr,^_ 

Thus, it is enough to apply the maximum volume algorithm m to the matrix , deriving 
local maxvol indices C {1,.. + 1)}, and obtain both and C/<Ta+i by the 

restriction 

I(m) ^ C/<^+i = (to e 

The backward transition m + 1 —> m for and C/>ra can be written analogously. 
We show it directly in Alg(T] below. In total, we need only ©(n^tMpr^) entries of u to be 
evaluated, where is the number of alternating iterations, typically of the order of 10. 


3.4 Rank-adaptive DMRG-cross algorithm 

A drawback of the TT-cross method is that the TT ranks are fixed; they have to be guessed 
a priori, which is also a problem of exponential complexity in M. A remedy is to consider 
larger portions of data in each step. The Density Matrix Renormalization Group (DMRG) 
algorithm was developed in the quantum physics community ([SS], see also the review [SB] 
and the references therein) to solve high-dimensional eigenvalue problems coming from the 
stationary spin Schroedinger equation. It is written in a similar alternating fashion as the 
TT-cross procedure described above. The crucial difference is that instead of one TT block 
as in fl2n|) we calculate two neighboring TT blocks at once. 

In the DMRG-cross |S7| interpolation algorithm, this is performed as follows. Given 
we compute 


Then we need to separate indices cXra and (Xra+i to recover the initial TT structure. This 
can be done via the singular value decomposition. The four-dimensional array 
is reshaped to a matrix G M’'ni-dT’'^+b^hm+i+i)T,n+i ^ truncated SVD is 

computed. 




where V G g W G Mhm+i+i)Tni+, || . |||. jg ^pg Probenius 

norm. The new TT rank is likely to differ from the old one Ttti. After that, we rewrite 

and with V and W, respectively, replace =^m.y and continue the iteration. 

To ensure that the perturbations introduced to and the whole tensor u coincide, 

we need to ensure that the interfaces and have orthonormal columns and rows, 

respectively. Fortunately, this requires only small-sized QR decompositions of matrices 
and [52l [58] in the first iteration. Later, the SVD will provide orthogonal factors in 
^{m.) gutomatically. 
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This algorithm allows a fast adaptation of TT ranks towards the requested accuracy 
threshold. The price is, however, a larger degree of p in the complexity, since we perform 
a full search in both am and am+i in each step. The greedy-cross method |56) avoids 
this problem by maximizing the error over only 0(rp) random entries among all O(r^p^) 
elements in Here, instead of the neighboring block jg error that 

provides additional information and improves the approximation. However, for the KLE- 
PCE coefficient ([2]), we have other candidates for such auxiliary data. And we have reasons 
to consider them prior to the error. 

3.5 Block TT-Cross interpolation algorithm 

Note that each call of dH]) throws L values, corresponding to different f = 1,..., L. We 
may account for this in various ways. Since i has the meaning of the reduced spatial 
variable, we may feature it as an additional dimension. But when we restrict the indices 

we will remove some values of ^ from consideration. Therefore, a 
vast majority of information cannot be used: we evaluate L values, but only a few of them 
will be used to improve the approximation. Another way is to run L independent cross (e.g. 
DMRG-cross) algorithms to approximate each Kf(Q:) in its own TT format. Yet, this is also 
not very desirable. First, the TT ranks for the whole k, are usually comparable to the ranks 
of individual TT formats. Therefore, L cross algorithms consume almost L times more time 
than the single run. Secondly, we will have to add L TT formats to each other by summing 
their ranks, which is to be followed by the TT approximation procedure. The asymptotic 
complexity thus scales as which was found to be too expensive in practice. 

A better approach is to store all in the same TT representation, employing the 

idea of the block TT format PI- The resulting method has a threefold advantage: all 
data in each call of ([9]) are assimilated, the algorithm adjusts the TT ranks automatically 
according to the given accuracy, and the output is returned as a single optimally-compressed 
TT format, convenient for further processing. 

Assume that we have a procedure that, given an index ai,...,ajvi, throws L values 
Uf (ck), i — 1,..., L. When the tensor entries j evaluated, we modify 

fl2n|) as follows: 

= ( 21 ) 

Now is a four-dimensional tensor, in the same way as in the DMRG-cross. We need 
to hnd a basis in ct that is best suitable for all Uf. Hence, we reshape to the matrix 
g ]^rn,_,(pm+i)xLTm compute the truncated singular value decomposition 

^ ^ i: G W (22) 

Again, the new rank satishes the Frobenius-norm error criterion, and replaces for the 

next iteration. In the backward iteration, we reshape to G M’-’’™-'^hm+i)Tm 
take the right singular vectors to the new TT block, 

Vk G i; G ^{m\ (^23) 

The whole procedure is summarized in the Block TT-Cross Algorithm [H 
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Algorithm 1 Block cross approximation of tensors in the TT format 
Require: A function to evaluate U£(ai,..., aivi), initial TT guess 
relative accuracy threshold e. 

Ensure: Improved TT approximation 

1 : Initialize = D, = 0, C/>m = 1. 

2: for iteration = 1,2,..., n^t or until convergence do 
3: for m = 1,2,..., M. — 1 do {Forward sweep} 

4: if iteration > 1 then {All indices are available, assimilate the information} 

5: Evaluate the tensors at cross indices and compute the common block by fl2ip . 

6: Compute the truncated SVD (1221) with accuracy e. 

7: else {Warmup sweep: the indices are yet to be built} 

8: Find QR decomposition = I. 

9: Replace = q^'^K 

10: end if 

11: Compute the pre-restricted interface 

12: Find local maxvol indices ira = maxvol 

13: New indices p^^} interface ^ 

14: end for 

15: for m = M, M — 1,..., 2 do {Backward sweep} 

16: Evaluate the tensors at cross indices and compute the common block by fl2T11 . 

17: Compute the truncated SVD fl23ll with accuracy e. 

18: Compute the pre-restricted interface 

19: Find local maxvol indices = maxvol 

20: Restrict = {anx, (Tm), ^>m-i = ^ R’'— 

21: end for 

22: Evaluate the hrst TT block = U{ (ai, 

23: end for 


4 TT-structured calculations with PCE 

4.1 Computation of the PCE in the TT format via the cross inter¬ 
polation 

Equipped with Alg. [T], we may apply it to the PCE approximation, passing Formula ([9]) as 
a function Uf(Q:) that evaluates tensor values on demand. The initial guess may be even a 
rank-1 TT tensor with all blocks populated by random numbers, since the cross iterations 
will adapt both the representation and TT ranks. 

Considering the sizes of the involved matrices, the complexity estimate can be written 
straightforwardly. 

Statement 6. The cost to compute via the block TT-Cross Algorithm{^is 

0(r^p(MN -I- Nh) -I- PpL -|- r^pL ■ min{p, L}). 

The hrst two terms come from Formula ([2]), and the last one is the complexity of SVDs 
(|22|1 and (123]). 
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Remark 7. It is unclear in general which term will dominate. For large N, we are 
typically expecting that it is the evaluation 021 p . However, i/N is moderate (below lOOOj, 
but the rank is large lOOj, the SVD consumes most of the time. For the whole algorithm, 
assuming also L ~ M, we can thus expect the 0(nitM.^Npr^) complexity, which is lower 
than which we could receive if we run independent cross algorithms for each 

As soon as the reduced PCE coefficients ikiict) are computed, the initial expansion OTop 
comes easily. Indeed, after the backward cross iteration, the I index belongs to the hrst TT 
block, and we may let it play the role of the “zeroth” TT rank index, 

= Y_ (24) 

S] 


For £ = 0 we extend this formula such that ko[cx.) is the hrst identity vector eo, cf. 
Now we collect the spatial components from (|8]) into the “zeroth” TT block. 






J {=0 


[k(x) Vi I 


vl(x)] , 


then the PCE ([6]) writes as the following TT format. 


K(x,a)= Y_ 

£,si 




(na. 

(25) 


(26) 


4.2 Stiffness Galerkin operator in the TT format 

With the full set 3m, p, we may beneht from the rank-1 separability of since each index 
(Xtr) Pm) x^m varies independently on the others. 

Lemma 8. Given the PCE TT format fl26p for the coefficient k with the TT ranks t(K), 
the Galerkin operator d can be constructed in the TT format with the same ranks. 

Proof. Given the PCE (12^ in the TT format, we split the whole sum over u in (IT^ to the 
individual variables. 


(27) 

Pm) = TU = 1, . . . , M. 

Vm=0 


A similar reduction of a large summation to one-dimensional operations arises also in quan¬ 
tum chemistry [SO]. Agglomerate from fllip to the “zeroth” TT block for the 

operator, then the TT representation for the operator writes with the same TT ranks as in 

K = Y ® ® ® e (^28) 

t!,si 


□ 
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One interesting property of the Hermite triples is that ^ = 0 if e.g. a/ > a + |3. 
That is, if we set the same p for a, (3 and 'v, in the assembly of ffTT|) we may miss some 
components, corresponding to a > p/2, |3 > p/2. To compute the Galerkin operator 
exactly, it is reasonable to vary Vra in the range {0, ..., 2 p}, and hence assemble k in the 
set 3m,2p- While in the sparse set it would inflate the storage of and K signihcantly, in 
the TT format it is feasible: the TT ranks do not depend on p, and the storage grows only 
linearly with p. 


4.3 Computation of the solution function 

Having solved the system 0161) . we obtain the PCE coefficients of the solution in the TT 
format, 

u(x,q:)= ^ (29) 

SOvjSM-I 

Some statistics are computable directly from u(x, ck), but generally we need a function in 
the initial random variables, u(x, 0). Since 3m, p is a tensor product set, u(x, ck) can be 
turned into u(x, 0} without changing the TT ranks, similarly to the construction of the 
Galerkin matrix in the previous subsection; 


u[x,0}= y 


So,.-,Sm-1 





S0)Sl 


WI, 


L' 


‘•Km 




(30) 


4.4 Computation of statistics 

In this section we discuss how to calculate some statistical outputs from the solution in the 
TT format, such as the mean, the (co)variance and the probability of an event. 

The mean value of u, in the same way as in k, can be derived as the PGE coefficient at 
a = (0,..., 0), u(x) = u(x, 0,..., 0). It requires no additional calculations. 

The covariance is more complicated and requires both multiplication (squaring) and 
summation over ot. By dehnition, the covariance reads 


coVu(x,p) 


(u(x, 0) - u(x)) (u(p, 0) - u(p)) q[0)^0 


= Y _ u(x,q:)u(p,/3) 


Hc.(6>)H;3(6>)p(6>)d6>. 


Knowing that J Ha(0)H^(0)p(0)d0 = we take ^ 5 ^( 0 :) = 'uio'(ai) ■ ■ ■ nf'^^((Xjvi) 

from fl29|) and multiply it with the Hermite mass matrix (TT ranks do not change), 

mjja) Ws„(q:)\/^= Y _ , (31) 

S] 


and then take the scalar prodnct C — s'], where — (rUso, rCs') with w defined in 

ca. Given the TT rank bound r for u(x, ck), we deduce the O(Mpr^) complexity of this 
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step. After that, the covariance is given by the product of C with the spatial TT blocks, 


To 


cov. 




o^s' 


(i)), 


(32) 


So,s'=0 


where is the “zeroth” (spatial) TT block of the decomposition fl2^ . Given N degrees 
of freedom for x, the complexity of this step is Note that a low-rank tensor 

approximation of a large covariance matrix is very important, for example, in Kriging isa. 
The variance is nothing else than the diagonal of the covariance, var^lx) = coVu(x, x). 

Other important outputs are the characteristic, level set functions, and the probability 
of a particular event |19] . 

Definition 9. Let § C M &e a subset of real numbers. 

• The characteristic function ofu at §> is defined pointwise for all 6 G as follows, 


XsiXyO) 


1 , u(x, 6) e §, 
0, u(x, 9) ^ S. 


(33) 


• The level set function reads /C§(x, 0] u(x, 

• The probability o/§ reads = Jrm 

The characteristic function can be computed using either the cross Algorithm [T] (see 
also 01®. which takes Formula flH^ as the function that evaluates a high-dimensional 
array x at the indices in x,0, or the Newton method for the sign function [19]. In both 
cases we may face a rapid growth of TT ranks during the cross or Newton iterations; the 
characteristic function is likely to have a discontinuity that is not aligned to the coordinate 
axes, and some of the singular values in the TT decomposition will decay very slowly. We 
face the same problem with increasing ranks during computing the level set functions and 
exceedance probabilities. 

However, the probability is easier to compute, especially if it is relatively small. Using 
the same cross algorithm, we can compute directly the product )^§(x, 0) = Xs(^)^)p(^)- 
The probability (at a hxed x) is then computed as a scalar product with the all-ones vector 
in the TT format, Ps(x) = (5^, 1). But if P is small, it means that most of the entries in 
are small, and do not inflate TT ranks, which might be the case for x- Typically, the 
computation of small probabilities is used to predict the failure risk of a technical system. 
The event set has the form S = {z G M : z > t}, and the probability is called the exceedance 
probability. This will be studied in numerical experiments. 


5 Numerical Experiments 

We verify the approach on the elliptic stochastic equation ([T]) in a two-dimensional h-shape 
domain, x = (xijXi) G D = [—1,1]^\[0,1]^. We pose zero Dirichlet boundary conditions 
and use the deterministic right-hand side f = f(x) = 1 . The stochasticity is introduced in 
the diffusion coefficient k(x, cu); we investigate log-normal and beta distributions for k. 

To generate the spatial mesh, we use the standard PDF Toolbox in MATLAB. We 
consider from 1 to 3 rehnement levels of the spatial grid, denoted by R. The hrst rehnement 
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R = 1 yields 557 degrees of freedom, R = 2 gives 2145 points, and R = 3 corresponds to 
8417 points. Since we have to store the stiffness matrices in a dense form, we cannot rehne 
the grid further. 

The KLE of both k and y is truncated to the same number of terms L = M. 

All utilities related to the Hermite PCE were taken from the sglib [68], including dis¬ 
cretization and solution routines in the sparse polynomial set dMp- However, to work with 
the TT format (for full Sm,p), we employ the TT-Toolbox [53]. The same polynomial orders 
are chosen in all variables, p = (p,... ,p). We use the modules of sglib for low-dimensional 
stages and replace the parts corresponding to high-dimensional calculations by the TT al¬ 
gorithms. The block TT-cross Alg. [Tjis implemented in the procedure amen_cross .m from 
the TT-Toolbox, and the linear system f[T3]) was solved via the Alternating Minimal Energy 
(AMEn) method [H], the procedure amen_solve.m from the companion package tAMEn 
na. Computations were conducted in MATLAB R2012a on a single core of the Intel Xeon 
X5650 CPU at 2.67GHz, provided by the Max- Planck-Institute, Magdeburg. 

The accuracy of the coefficient and the solution was estimated using the Monte Carlo 
method with simulations. We approximate the average L 2 -error as follows. 






L 

Z—] 






|k(x,0) - k,(x,0)||l2[d) 
||k*(x,0)||l2(d) 


p( 0 )d 0 , 


where are normally distributed random samples, are the spatial grid points, 

and K*(xi,0z) = 4^ (ytxij^z)) is the reference coefficient computed without using the PCE 
for 4^. The same dehnition is used for the solution u, with u*(x, 0^) being the solution of 
the deterministic PDE with the coefficient k* (x, Oj] . 

Besides that we compare the statistics obtained with our approaches and the Monte 
Carlo method. For the mean and variance we use the same discrete approximation to the 
h2-norm, 

T- _ llh-— 14*1112(0] p _ ||varu — varu*||L2(D) 

IIW||l2(d) ||varu.*||L2(D) 

To examine the computation of probabilities, we compute the exceedance probability. 
This task can be simplified by taking into account the maximum principle: the solution 
is convex w.r.t. x. We compute the maximizer of the mean solution, x^ax • h-lXmax) ^ 
u(x) Vx G D. Fix X to Xmax and consider only the stochastic part Umaxl^) = ■^.(Xmax^^), 
and fl = u(Xmax)- Now, taking some t > 1 , we compute 


P = P(Umax(^) > TTR) = 


Xu,„,.(0)>T<i(6>)p(6>)d6>. 


(34) 


By P* we will also denote the probability computed from the Monte Carlo method, and 
estimate the error as Ep = |P — P*| /P*. 

5.1 Log-normal distribution 

Let 

k(x, cu] = exp I 1 -P ff- 2 - 1+10, 
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Table 1: Performance of the block cross Alg. [T] versus the DMRG cross EU 


M 

Block cross 
GPU time, sec. 

tK 

DMRG crosses 
GPU time, sec. Tk 

1 KdmRG ~ KBlockl 

1 KBlockl 

10 

4.908 

20 

31.29 

20 

2.77e-5 

15 

10.36 

27 

114.9 

27 

2.24e-5 

20 

19.11 

32 

286.2 

33 

1.83e-4 

30 

49.89 

39 

1372.0 

50 

2.52e-4 


where y ~ !N(0,1) is the standard normally distributed random held. The covariance func¬ 
tion is taken Gaussian, coVk(x, y) = exp )’ ''^here h is the (isotropic) correlation 

length. 

The default parameters are the following; number of KLE terms M. = 20, polynomial 
order p = 3, correlation length G = 1, dispersion 0 = 0.5, rehnement level of the spatial 
grid R = 1, and the tensor approximation accuracy e = lO^'^. Below we will vary each of 
these parameters, keeping the others hxed. For the computation of the probability fl34p we 
use r — 1.2. 

5.1.1 Verification of the block cross algorithm 

Formula ([9]) can be evaluated for each KLE index f independently, using existing cross ap¬ 
proximation algorithms. We compare with the so-called DMRG cross method EH, which 
is conceptually the closest approach to our Algorithm [T] In Table [T] we show the perfor¬ 
mance of the single run of Algorithm [T] (which gives the coefficient for all f simultaneously) 
and of L DMRG crosses, followed by the summation of individual terms to the common 
representation. We see that even if the TT ranks of the output are exactly the same, the 
times differ dramatically. This is because the ranks of individual components and of the 
whole coefficient are comparatively the same, and the DMRG approach requires roughly L 
times more operations. The last column in Table [T] con&ms that both approaches deliver 
the same data up to the approximation tolerance. 

5.1.2 Experiment with the polynomial order p 

First, we provide a detailed study of the computational time of each of the stages in the 
TT and Sparse methods: construction of the coefficient (Tk), construction of the operator 
(Top), and the solution of the system (To.). The results are shown in Table [21 and times are 
measured in seconds. The complexity of the cross algorithm, employed for the computation 
of K in the TT format, grows linearly with p, since the TT ranks are stable w.r.t. p (see 
Table |3]). However, it is much slower than the direct evaluation of the coefficients in the 
sparse set. This is mainly due to the singular value decompositions, involving matrices of 
sizes hundreds. 

Nevertheless, for the computation of the Galerkin matrix the situation is the opposite. 
In the TT format, the computations via formula (127)) are very efficient, since they involve M 
products of p^ X 2p matrices. In the sparse representation, we have to perform all 
operations, which is very time-consuming. Since grows exponentially with p, we had 

to skip the cases with large p. 
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Table 2: Detailed CPU times (sec.) versus p, log-normal distribution 



TT (full 

index set 3m,p) 

Sparse 

(index set ^^ ) 

p 

Tk 

Tap 

Tu 

Tk 

Top 

Tu 

1 

9.6166 

0.1875 

1.7381 

0.4525 

0.2830 

0.6485 

2 

14.6635 

0.1945 

2.9584 

0.4954 

3.2475 

1.4046 

3 

19.1182 

0.1944 

3.4162 

0.6887 

1027.7 

18.1263 

4 

24.4345 

0.1953 

4.2228 

2.1597 

— 

— 

5 

30.9220 

0.3155 

5.3426 

9.8382 

— 

— 




Table 3: Performance versus p, log-normal c 

istribution 


P 

CPU time. 

sec. 

tK 

ru 


Ek 

E, 

1 

P 


TT 

Sparse 



TT Sparse 

TT 

Sparse 

TT 

1 

11.54 

1.38 

0.23 

32 

42 

1 

3.75e-3 1.69e-l 

9.58e-3 

1.37e-l 

0 

2 

17.81 

5.14 

0.32 

32 

49 

1 

1.35e-4 l.lOe-1 

4.94e-4 

4.81e-2 

0 

3 

22.72 

1046 

83.12 

32 

49 

462 

6.21e-5 2.00e-3 

2.99e-4 

5.29e-4 

2.75e-4 

4 

28.85 

— 

69.74 

32 

50 

416 

6.24e-5 — 

9.85e-5 

— 

1.21e-4 

5 

36.58 

— 

102.5 

32 

49 

410 

6.27e-5 — 

9.36e-5 

— 

6.20e-4 


The solution stage is more simple, since the mean value preconditioner is quite efficient, 
both for the standard CG method with the sparse set and the AMEn method for the 
TT format. Again, the complexity of the TT solver grows linearly with p. The sparse 
solver works reasonably fast as well, but it cannot be run before the matrix elements are 
computecOl; hence it is also skipped for p = 4,5. 

In Table [3] we present the total CPU times required in both methods to find the solution 
u, the time for computing jl, maximal TT ranks of the coefficient (r^), the solution (r^) 
and the weighted characteristic function (r^), as well as statistical errors in the coefficient 
(Ek) and the solution (Eu). The probability P is presented only for the TT calculation. 
Since P ~ 10000 simulations may be not enough to compute P with the Monte Carlo 

method. Below we present a dedicated test of the Monte Carlo approach. 

5.1.3 Experiment with the KLE dimension M 

The length of the truncated KLE is another crucial parameter of the stochastic PDE. In 
Table H] we compare the TT and Sparse procedures. 

We see that the accuracy of the TT approach is stable w.r.t. M, and the complexity 
grows mildly. Note, however, that the correlation length U = 1 is quite large and yields 
a fast decay of the KLE, such that M. = 20 is actually enough for the accuracy 10^^. 
The TT approach demonstrates stability w.r.t. the overapproximation at M = 30. This 
is not the case for the sparse approach: at high M. the accuracy is lost. This is because 
p = 3 is not enough to transform the covariance dl]) accurately. Since eigenvalues of the 
covariance decay rapidly, higher eigenpairs become strongly perturbed (the eigenvalues can 
even become negative), and a large error propagates to the PCE. In the full set, the maximal 

^It is sometimes advocated to avoid a construction of the matrix and to compute its elements only 
when they are needed in the matrix-vector product. It saves memory, but the computational time will be 
comparatively the same, since it is proportional to the number of operations. 
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Table 4: Performance versus M, log-normal distribution 


M 

GPU time. 

sec. 




h 

K 

h, 

X 

P 


TT 

Sparse 



TT 

Sparse 

TT 

Sparse 

TT 

10 

6.401 

6.143 

1.297 

20 

39 

70 

2.00e-4 

1.71e-l 

3.26e-4 

1.45e-l 

2.86e-4 

15 

12.15 

92.38 

22.99 

27 

42 

381 

7.56e-5 

1.97e-3 

3.09e-4 

5.41e-4 

2.99e-4 

20 

21.82 

1005 

67.34 

32 

50 

422 

6.25e-5 

1.99e-3 

2.96e-4 

5.33e-4 

2.96e-4 

30 

52.92 

48961 

136.5 

39 

50 

452 

6.13e-5 

9.26e-2 

3.06e-4 

5.51e-2 

2.78e-4 


Table 5: Performance versus 1^, log-normal distribution 


Ic 

GPU time. 

sec. 


tu 


h 

K 

h, 

X 

P 


TT 

Sparse 



TT 

Sparse 

TT 

Sparse 

TT 

0.1 

216 

55826 

0.91 

70 

50 

1 

1.98e-2 

1.98e-2 

1.84e-2 

1.84e-2 

0 

0.3 

317 

52361 

41.8 

87 

74 

297 

3.08e-3 

3.51e-3 

2.64e-3 

2.62e-3 

8.59e-31 

0.5 

195 

51678 

58.1 

67 

74 

375 

1.49e-4 

2.00e-3 

2.58e-4 

3.10e-4 

6.50e-33 

1.0 

57.3 

55178 

97.3 

39 

50 

417 

6.12e-5 

9.37e-2 

3.18e-4 

5.59e-2 

2.95e-04 

1.5 

32.4 

49790 

121 

31 

34 

424 

3.24e-5 

2.05e-l 

4.99e-4 

1.73e-l 

7.50e-04 


polynomial order is equal to pM, and the error of the covariance transform is negligible. 

5.1.4 Experiment with the correlation length h 

The Gaussian covariance function yields an exponential decay of the KLE coefficients |59l 
WI\ . but the actual rate is highly dependent on the correlation length |321 ITT] . In this 
experiment, we study the range of lengths from 0.1 to 1.5. In order to have a sufficient 
accuracy for all values of h, we hx the dimension M = 30. The results are presented in the 
same layout as before in Table [S] 

In the TT approach, we see a clear decrease of the computational complexity and the 
error with growing covariance length. This is because the SVD approximation in the TT 
format automatically reduces the storage w.r.t. the latter (less important) variables, if the 
KLE decay is fast enough. The TT errors reflect the amount of information discarded in 
the truncated KLE tail, which is large for small Ic and small otherwise. 

The errors in the sparse approach behave in the same way until I^ = 0.5, but for Ic = 1 
and 1.5 the dimension M = 30 is too large, and the instability w.r.t. the over approximation 
takes place. 

With hxed M, the exceedance probability is very sensitive to the correlation length. 
Truncating the KLE, we reduce the total variance of the random field. For a (quasi)- 
Gaussian distribution, a small perturbation of the variance has a small effect on the integral 
over the peak region, but may have a very large relative effect on the tail region, which 
corresponds to the small exceedance probability. 

5.1.5 Experiment with the dispersion a 

The variance of the normally distributed held y(x, cu) is equal to (J^. Since it enters k 
under the exponential, it inhuences the variance of k signihcantly. In Table [6] we vary cr 
from 0.2 to 1 and track the performance of the methods. As expected, the computational 
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Table 6: Performance versus cr, log-normal distribution 


cr 

CPU time. 

sec. 


Tu 


b 

K 

b. 

X 

P 


TT 

Sparse 



TT 

Sparse 

TT 

Sparse 

TT 

0.2 

15.93 

1008 

0.348 

21 

31 

1 

5.69e-5 

4.76e-5 

4.19e-5 

1.30e-5 

0 

0.4 

18.72 

968.3 

0.341 

29 

42 

1 

6.88e-5 

8.04e-4 

1.40e-4 

2.14e-4 

0 

0.5 

21.23 

970.1 

79.96 

32 

49 

456 

6.19e-5 

2.02e-3 

3.05e-4 

5.45e-4 

2.95e-4 

0.6 

24.08 

961.5 

24.72 

34 

57 

272 

9.12e-5 

4.42e-3 

6.14e-4 

1.16e-3 

2.30e-3 

0.8 

31.69 

969.1 

67.93 

39 

66 

411 

4.40e-4 

8.33e-2 

2.02e-3 

2.90e-2 

8 .02e-2 

1.0 

50.67 

1071 

48.44 

44 

82 

363 

1.73e-3 

4.10e-l 

4.96e-3 

3.08e-l 

9.17e-2 


Table 7; Performance versus R, log-normal distribution. The left column shows the number 
of spatial degrees of freedom (^DoF) for R = 1,2,3. 


#DoF 

CPU time. 

sec 

tK 

Tu 


b 

K 

b, 

X 

P 


TT 

Sparse 



TT 

Sparse 

TT 

Sparse 

TT 

557 

6.40 

6.09 

1.29 

20 

39 

71 

2.00e-4 

1.71e-l 

3.26e-4 

1.45e-l 

2.86e-4 

2145 

8.98 

13.7 

1.17 

20 

39 

76 

1.74e-4 

1.89e-3 

3.33e-4 

5.69e-4 

2.90e-4 

8417 

357 

171 

0.84 

20 

40 

69 

1.65e-4 

1.88e-3 

3.24e-4 

5.64e-4 

2.93e-4 


complexity grows with a, as does the contrast in the coefficient. However, we were able to 
perform all computations for each value of u. 

5.1.6 Experiment with the spatial grid refinement R 

Since the efforts of dealing with the full spatial matrix grow significantly with the grid 
refinement, in this test we limit ourselves to M = 10. The principal observations from 
Table [7] are that the TT rank and the accuracjH are stable w.r.t. the grid refinement. 
Therefore, we may expect that the TT approach will also be efficient for finer grids, if 
we find an efficient way to deal with the spatial dimension. A research on non-intrusive 
stochastic Galerkin methods, addressing this issue, has begun recently [22], and we plan to 
adopt it in the TT framework in future. 

5.1.7 Comparison with the Monte Carlo Method 

For the Monte Carlo test, we prepare the TT solution with parameters p = 5 and M = 30. 
The results are presented in Table [HI In the left part of the table we show the performance 
of the Monte Carlo method with different numbers of simulations: total CPU time (Tjvic), 
errors in the mean and variance of u, and a small exceedance probability with its error. 
The right part contains the results of the TT approach: the aggregated CPU time of 
construction of the coefficient, operator and solution (Tjoive), the time to compute the 
weighted characteristic function (T^^), TT ranks of all data and the probability calculated 
from 

We see that the cost of the TT method is comparable with the cost of 40000 Monte 
Carlo tests. That many realizations already provide a good approximation of the mean, a 
bit less accurate for the variance, but it is by far not sufficient for a confident estimate of 

^Note that the errors are estimated via the Monte Carlo method on the same grids, thus they show the 
accuracy of the PCE approximation, not the spatial discretization. 
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Table 8: Verification of the Monte Carlo method, log-normal distribution 


^mc 

Tmc; sec. 

hu 

Bvar^ 

P* 

Ep 

TT results 

10 ^ 

0.6398 

9.23e-3 

1.49e-l 

0 

oo 

Tsolve 

96.89 sec. 

10 ^ 

6.1867 

1.69e-3 

5.97e-2 

0 

oo 


157.0 sec. 

10 ^ 

6.1801-101 

5.81e-4 

7.12e-3 

4.00e-4 

5.53e-l 


39 

10 ^ 

6.2319-10^ 

2.91e-4 

2.91e-3 

4.10e-4 

5.15e-l 

tu 

50 

10^ 

6.3071-103 

1.23e-4 

9.76e-4 

4.60e-4 

3.51e-l 

P 

432 

6.214e-4 


Table 9; Performance versus p, beta distribution 


P 

CPU time. 

sec. 

tK 

ru 


E 

K 

Eu 

P 


TT 

Sparse 



TT 

Sparse 

TT 

Sparse 

TT 

1 

21.40 

1.382 

0.059 

64 

49 

1 

2.24e-3 

5.13e-2 

1.14e-2 

2.50e-2 

0 

2 

39.87 

5.301 

0.100 

65 

50 

1 

1.92e-4 

5.50e-3 

7.67e-4 

1.28e-3 

0 

3 

57.16 

1000 

70.98 

65 

50 

445 

9.07e-5 

1.76e-3 

5.01e-4 

1.06e-3 

1.88e-4 

4 

76.22 

— 

21.18 

65 

50 

416 

8.81e-5 

— 

1.41e-4 

— 

9.84e-5 

5 

100.6 

— 

119.7 

65 

50 

428 

8.89e-5 

— 

l.lOe-4 

— 

1.23e-4 


the exceedance probability. Therefore, the tensor product methods can be recommended 
as a competitive alternative to classical techniques for computing exceedance probabilities. 


5.2 Beta distribution 

The Hermite expansion ([3]) of the exp function in the log-normal case yields the coefficients 
of the form = p- Therefore, the PCE coefficient formula ([7]) resolves to a direct product 
of univariate functions of ai,..., aM, and the tensor format of the PCE can be constructed 
explicitly [T8j. To demonstrate the generality of the cross algorithm, we also consider a 
more exotic beta-distributed coefficient. 


k(x, cu) = ®52 


1 -|- erf 


Yb,cu) 

vT 


-|-1, where !Ba^b(z) = 


1 


B(a,b) J 


fa-lp 


0 


For the purpose of computing (f)t, the function 
Again, the covariance function is coVk(x, p) = exp 


is inverted by the Newton method. 


Since this distribution varies stronger than the log-normal one, for the computation 
of the probability fl3T|) we use larger t = 1.4. All other parameters are the same as in 
the experiments with the log-normal coefficient. The performance of both TT and Sparse 
approach in case of the beta distribution is shown in Tables O [IHl [HI IHl ESI for p, M, b, 
the spatial grid level and the Monte Carlo tests, respectively. 

We see the same behavior as in the log-normal case. The only signihcant difference is 
the lower error of the Sparse method in the case M = 10, = 1, which is 1.08e-3 for the 

beta distribution and 1.45e-l for the log-normal one. 
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Table 10; Performance versus M, beta distribution 


M 

CPU time. 

sec. 

tK 



E 

K 

E, 

X 

P 


TT 

Sparse 



TT 

Sparse 

TT 

Sparse 

TT 

10 

9.777 

5.796 

0.942 

34 

40 

39 

1.70e-4 

1.65e-3 

5.18e-4 

1.08e-3 

1.95e-4 

15 

26.46 

90.34 

25.16 

50 

48 

374 

1.03e-4 

1.73e-3 

4.96e-4 

1.08e-3 

1.94e-4 

20 

56.92 

986.2 

59.57 

65 

50 

413 

9.15e-5 

1.80e-3 

5.08e-4 

1.09e-3 

1.88e-4 

30 

156.7 

55859 

147.9 

92 

50 

452 

7.75e-5 

7.01e-2 

5.12e-4 

4.53e-2 

1.85e-4 


Table 11; Performance versus Ic, beta distribution 


Ic 

CPU time. 

sec. 

Tk 



Ek 

Eu 

P 


TT 

Sparse 



TT 

Sparse 

TT 

Sparse 

TT 

0.1 

665.8 

55923 

0.91 

90 

48 

1 

8.7e-3 

8.77e-3 

7.9e-3 

7.92e-3 

0 

0.3 

2983 

53783 

1.49 

177 

74 

1 

1.5e-3 

2.02e-3 

1.2e-3 

1.30e-3 

0 

0.5 

1138 

54297 

100 

132 

74 

403 

1.5e-4 

1.71e-3 

2.9e-4 

8.21e-4 

2.47e-23 

1.0 

158.8 

56545 

153 

92 

50 

463 

7.8e-5 

6.92e-2 

5.1e-4 

4.47e-2 

1.96e-04 

1.5 

62.20 

55848 

89.5 

75 

42 

409 

6.9e-5 

7.85e-2 

8.3e-4 

4.56e-2 

2.20e-03 


6 Conclusion 

We have developed the new block TT cross algorithm to compute the TT approximation of 
the polynomial chaos expansion of a random held with the tensor product set of polynomials, 
where the polynomial degrees are bounded individually for each random variable. The 
random held can be given as a transformation of a Gaussian held by an arbitrary smooth 
function. The new algorithm builds the TT approximation of the PCE in a black box 
manner. Compared to the previously existing cross methods, the new approach assimilates 
all KLE terms simultaneously, which reduces the computational cost signihcantly. 

The uncertain (dihusion) coefficient in the elliptic PDE is approximated via PCE. We 
show that the tensor product polynomial set allows a very efficient construction of the 
stochastic Galerkin matrix in the TT format, provided the coefficient is precomputed in the 
TT format. Interestingly, we can even compute the Galerkin matrix exactly by preparing 
the coefficient with two times larger polynomial orders than those employed for the solution. 
In the TT format, we can store the Galerkin matrix in the dense form, since the number of 
the TT elements is feasible for p ~ 10. This also means that other polynomial 

families, such as the Chebyshev or Laguerre, may be used straightforwardly. 

The Galerkin matrix dehnes a large linear system on the PCE coefficients of the solution 
of the stochastic equation. We solve this system in the TT format via the alternating 
minimal energy algorithm and calculate the post-processing of the solution, such as the 
mean, variance and exceedance probabilities. 

We demonstrate that with the new TT approach we can go to a larger number of 
random variables (e.g. M = 30) used in the PCE (larger stochastic dimension) and take a 
larger order of the polynomial approximation in the stochastic space (p = 5) on a usual PC 
desktop computer. For all stages of numerical experiments (computation of the coefficient, 
operator, solution and statistical functionals) we report the computational times and the 
storage costs (TT ranks), and show that they stay moderate in the investigated range of 
parameters. 
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Table 12: Performance versus R, beta distribution 


#DoF 

GPU time. 

sec 

tK 

ru 


h 

K 

hu 

P 


TT 

Sparse 



TT 

Sparse 

TT 

Sparse 

TT 

557 

9.73 

5.94 

0.94 

34 

40 

39 

1.70e-4 

1.65e-3 

5.21e-4 

1.08e-3 

1.95e-4 

2145 

36.2 

12.7 

0.77 

34 

41 

41 

1.56e-4 

1.64e-3 

5.19e-4 

1.08e-3 

1.97e-4 

8417 

378 

162 

1.12 

34 

40 

43 

1.53e-4 

1.62e-3 

5.07e-4 

1.06e-3 

1.96e-4 


Table 13: Verification of the Monte Carlo method, beta distribution 


^mc 

Tmc; sec. 

hu 

hvaru 

P* 

hp 

TT results 

10 ^ 

0.9428 

9.12e-3 

1.65e-l 

0 

oo 

Tsolve 

278.4014 sec. 

10 ^ 

9.5612 

1.04e-3 

6.04e-2 

0 

oo 


179.4764 sec. 

10 ^ 

8.849-10' 

4.38e-4 

5.56e-3 

0 

oo 


92 

10 ^ 

8.870-10^ 

2.49e-4 

3.06e-3 

7.00e-5 

6.80e-l 

tu 

50 


8.883-10^ 

8.16e-5 

8.56e-4 

1.07e-4 

9.94e-2 

P 

406 

1.1765e-04 


In particular, the TT ranks do not grow with the polynomial degree p. This remains in 
sharp contrast to the traditional sparse polynomial approximation, where the total polyno¬ 
mial degree is bounded. The cardinality of this sparse polynomial set grows exponentially 
with p, but the tensor product decomposition is not possible anymore. This renders the to¬ 
tal computational cost of the sparse PCE approach higher than the cost of the TT method. 
Besides, the tensor product PCE is more accurate than the expansion in the sparse set due 
to a larger total polynomial degree. Comparison with the classical Monte Carlo method 
shows that the TT methods can compute the exceedance probabilities more accurately, 
since the TT format approximates the whole stochastic solution implicitly. 

Several directions of research can be pursued in the future. 

Currently, we store both the matrix and the inverted mean-held preconditioner in the 
dense form. This imposes rather severe restrictions on the spatial discretization. The spatial 
part of the Galerkin matrix must be dealt with in a more efficient way. 

With the tensor product methods the stochastic collocation approach seems very at¬ 
tractive |33]- We may introduce quite large discretization grids in each random variable 
Bra- additional data compression can be achieved with the QTT approximation It is 
important that the deterministic problems are decoupled in the stochastic collocation. The 
cross algorithms can become an efficient non-intrusive approach to stochastic equations. 
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